Are Latent Sentence Vectors Cross-Linguistically Invariant?
Author
Abstract
Previous work [Bowman et al., 2016] has shown that variational autoencoders (VAEs) can create distributed representations of natural language that capture different linguistic levels such as syntax, semantics, and style in a holistic manner. I investigate to what extent VAEs trained on different languages yield comparable representations. To this end, I train VAEs for English and French, and then train a transformation between the resulting latent spaces on the task of machine translation. An analysis of the resulting mapping from French to English sentences shows that the latent representations capture the presence of words, phrases, and the general topic. However, I do not find evidence that they also encode syntax and semantics in a cross-linguistically invariant manner.
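As an illustration only, the idea of learning a transformation between two latent spaces can be sketched as follows. The abstract does not state what form the transformation takes; this sketch assumes an affine map fit by ordinary least squares, and uses random vectors as stand-ins for VAE sentence encodings. All function and variable names are hypothetical.

```python
import numpy as np

def fit_affine_map(z_src, z_tgt):
    """Fit W, b minimizing ||z_src @ W + b - z_tgt||^2 (ordinary least squares).

    Assumption: the latent-space transformation is affine. The source
    abstract does not specify its form.
    """
    # Append a column of ones so the bias is learned jointly with W.
    ones = np.ones((z_src.shape[0], 1))
    design = np.hstack([z_src, ones])
    coef, *_ = np.linalg.lstsq(design, z_tgt, rcond=None)
    return coef[:-1], coef[-1]

def apply_affine_map(z_src, W, b):
    """Map source-language latent vectors into the target latent space."""
    return z_src @ W + b

# Synthetic stand-ins for paired French/English latent vectors.
# In a real setup these would come from the two trained VAE encoders.
rng = np.random.default_rng(0)
z_fr = rng.normal(size=(200, 8))
W_true = rng.normal(size=(8, 8))
b_true = np.full(8, 0.5)
z_en = z_fr @ W_true + b_true  # noise-free, so the fit can be exact

W, b = fit_affine_map(z_fr, z_en)
z_en_pred = apply_affine_map(z_fr, W, b)
```

In a setting like the one the abstract describes, the mapped vectors would then be passed to the English decoder, and the decoded sentences compared with reference translations.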
Similar resources
Hierarchical Text Generation and Planning for Strategic Dialogue
End-to-end models for strategic dialogue are challenging to train, because linguistic and strategic aspects are entangled in latent state vectors. We introduce an approach to generating latent representations of dialogue moves, by inducing sentence representations to maximize the likelihood of subsequent sentences and actions. The effect is to decouple much of the semantics of the utterance fro...
TCDSCSS: Dimensionality Reduction to Evaluate Texts of Varying Lengths - an IR Approach
This paper provides a system description for the cross-level semantic similarity task of the SEMEVAL-2014 workshop. Cross-level semantic similarity measures the degree of relatedness between texts of varying lengths, such as Paragraph to Sentence and Sentence to Phrase. Latent Semantic Analysis was used to evaluate the cross-level semantic relatedness between the texts to achieve above baseline sco...
SOLUTION-SET INVARIANT MATRICES AND VECTORS IN FUZZY RELATION INEQUALITIES BASED ON MAX-AGGREGATION FUNCTION COMPOSITION
Fuzzy relation inequalities based on max-F composition are discussed, where F is a binary aggregation on [0,1]. For a fixed fuzzy relation inequalities system $A \circ^{F} \textbf{x} \leq \textbf{b}$, we characterize all matrices $A'$ for which the solution set of the system $A' \circ^{F} \textbf{x} \leq \textbf{b}$ is the same as the original solution set. Similarly, for a fixed matrix $A$, the...
Three Sensitive Positions and Chinese Complex Sentences: A Comparative Perspective
The positioning of sentential connectives in Chinese complex sentences is more flexible than their counterparts in English. Sentential connectives in Chinese can be placed in three sensitive positions: clause-initial, predicate-initial, and clause-final positions. Due to the co-existence of prepositions and postpositions in the language, sentential connectives can be placed in both clause-initi...
The Role of Conceptualizable Agent in Overpassivization of English Unaccusatives in Iranian English Majors
The present study explores the effect of one pragmatic element of discourse, namely the conceptualizable agent, on the overpassivization of English unaccusative verbs. Using the questionnaire originally employed by Ju (2000), 206 Iranian intermediate and advanced English majors were asked to choose the more grammatical form (active or passive) in target sentences wi...